18 research outputs found

    Ethics and Big Data Charter: because my corpus is worth it!

    4 pages. National audience. An Ethics and Big Data Charter was designed in collaboration with academic and industrial stakeholders, to promote better documentation and better practices in corpus building.

    Crowdsourcing for Language Resource Development: Criticisms About Amazon Mechanical Turk Overpowering Use

    International audience. This article is a position paper about Amazon Mechanical Turk, the use of which has been growing steadily in language processing over the past few years. According to the mainstream opinion expressed in articles of the domain, this type of online work platform makes it possible to quickly develop all sorts of quality language resources, at a very low price, using people who do the work as a hobby. We shall demonstrate here that the situation is far from being that ideal. Our goal here is manifold: (1) to inform researchers, so that they can make their own choices; (2) to develop alternatives with the help of funding agencies and scientific associations; (3) to propose practical and organizational solutions in order to improve language resource development, while limiting the risks of ethical and legal issues without sacrificing price or quality; (4) to introduce an Ethics and Big Data Charter for the documentation of language resources.

    "Where the data are coming from?" Ethics, crowdsourcing and traceability for Big Data in Human Language Technology

    National audience. Based on experience gained from observing corpus development in HLT, the authors want to warn the Big Data community about some recent uses of human computation. For instance, the growing use in the HLT community of crowdsourcing methods, and especially of paid microworking crowdsourcing platforms, leads to many ethical, economic and legal concerns. The authors also want to encourage certain practices, especially concerning traceability, formalized as a charter: the Ethics and Big Data Charter.

    Evaluating Corpora Documentation with regards to the Ethics and Big Data Charter

    International audience. The authors have written the Ethics and Big Data Charter in collaboration with various agencies, private bodies and associations. This Charter aims at describing any large or complex resources, and in particular language resources, from a legal and ethical viewpoint, and at ensuring the transparency of the process of creating and distributing such resources. We propose in this article an analysis of the documentation coverage of the most frequently mentioned language resources with regard to the Charter, in order to show the benefit it offers.

    Covering various Needs in Temporal Annotation: a Proposal of Extension of ISO TimeML that Preserves Upward Compatibility

    International audience. This paper reports a critical analysis of the ISO TimeML standard, in the light of several temporal annotation experiments conducted on spoken French. It shows that the norm suffers from weaknesses that should be corrected to fit a larger variety of needs in NLP and in corpus linguistics. We present our proposal for some improvements to the norm before it is revised by the ISO Committee in 2017. These modifications mainly concern: (1) enrichments of well-identified features of the norm: temporal function of TIMEX time expressions, additional types for TLINK temporal relations; (2) deeper modifications concerning the units or features annotated: clarification of the distinction between time and tense for EVENT units, coherence of representation between temporal signals (the SIGNAL unit) and TIMEX modifiers (the MOD feature); (3) a recommendation to perform temporal annotation on top of a syntactic (rather than lexical) layer (temporal annotation on a treebank).

    Efficacy of vinblastine in central nervous system Langerhans cell histiocytosis: a nationwide retrospective study

    Background: Vinblastine (VBL) is the standard treatment for systemic Langerhans cell histiocytosis (LCH), but little is known about its efficacy in central nervous system (CNS) mass lesions. Methods: A retrospective chart review was conducted. Twenty patients from the French LCH Study Group register met the inclusion criteria. In brief, they had CNS mass lesions, had been treated with VBL, and were evaluable for radiologic response. Results: The median age at diagnosis of LCH was 11.5 years (range: 1-50). Intravenous VBL 6 mg/m² was given in a 6-week induction treatment, followed by a maintenance treatment. The median total duration was 12 months (range: 3-30). Eleven patients received steroids concomitantly. Fifteen patients achieved an objective response; five had a complete response (CR: 25%), ten had a partial response (PR: 50%), four had stable disease (SD: 20%) and one patient progressed (PD: 5%). Of interest, four out of the six patients who received VBL without concomitant steroids achieved an objective response. With a median follow-up of 6.8 years, the 5-year event-free and overall survival was 61% and 84%, respectively. VBL was well-tolerated and there were no patient withdrawals due to adverse events. Conclusion: VBL, with or without steroids, could potentially be a useful therapeutic option in LCH with CNS mass lesions, especially for those with inoperable lesions or multiple lesions. Prospective clinical trials are warranted for the evaluation of VBL in this indication.

    Caenorhabditis elegans Semi-Automated Liquid Screen Reveals a Specialized Role for the Chemotaxis Gene cheB2 in Pseudomonas aeruginosa Virulence

    Pseudomonas aeruginosa is an opportunistic human pathogen that causes infections in a variety of animal and plant hosts. Caenorhabditis elegans is a simple model with which one can identify bacterial virulence genes. Previous studies with C. elegans have shown that, depending on the growth medium, P. aeruginosa provokes different pathologies: slow or fast killing, lethal paralysis and red death. In this study, we developed a high-throughput semi-automated liquid-based assay such that an entire genome can readily be scanned for virulence genes in a short time period. We screened a 2,200-member STM mutant library generated in a cystic fibrosis airway P. aeruginosa isolate, TBCF10839. Twelve mutants were isolated, each showing at least 70% attenuation in C. elegans killing. The selected mutants had insertions in regulatory genes, such as a histidine kinase sensor of two-component systems and a member of the AraC family, or in genes involved in adherence or chemotaxis. One mutant had an insertion in a cheB gene homologue, encoding a methylesterase involved in chemotaxis (CheB2). The cheB2 mutant was tested in a murine lung infection model and found to have highly attenuated virulence. The cheB2 gene is part of the chemotactic gene cluster II, which was shown to be required for optimal mobility in vitro. In P. aeruginosa, the main player in chemotaxis and mobility is the chemotactic gene cluster I, including cheB1. We show that, in contrast to the cheB2 mutant, a cheB1 mutant is not attenuated for virulence in C. elegans, whereas its in vitro motility and chemotaxis are severely impaired. We conclude that the virulence defect of the cheB2 mutant is not linked with a global motility defect but that instead the cheB2 gene is involved in a specific chemotactic response, which takes place during infection and is required for P. aeruginosa pathogenicity.
